Executive Summary

DATA2002 Data Analytics: Learning from Data is an intermediate unit of study at the University of Sydney. The unit aims to equip students with the knowledge and skills to take on data analytic challenges stemming from everyday problems.

This report investigates whether the weights of transgenic and non-transgenic mice differ and, if so, which fragment or fragments of the DNA are responsible.

We found statistically significant evidence that the DNA fragments affect the two sexes differently: 152F7 has the largest effect on the weight of female mice, and 285E6 has the largest effect on the weight of male mice.

Introduction

DATA2002 Data Analytics: Learning from Data is an intermediate unit of study at the University of Sydney. The unit aims to equip students with the knowledge and skills to take on data analytic challenges stemming from everyday problems.

As part of the semester 2, 2018 assessment for the unit, students are required to analyze laboratory data obtained from the Human Genome Center at the Lawrence Berkeley Laboratory (Nolan and Speed, 2000).

Down Syndrome is a congenital syndrome that occurs when a child inherits an extra chromosome 21 from his or her parents. The syndrome is associated with some degree of physical and mental retardation and is one of the most common congenital syndromes. The population of people with Down syndrome in Australia is now over 13,000 (Down Syndrome Australia, 2018).

Scientists seek to assess the effectiveness of treatments that mitigate the syndrome; however, testing on humans at a very early stage of development is unethical. As a first step, the DNA of mice was altered to mimic the effect of Down Syndrome. Because Down Syndrome is closely associated with weight gain, scientists want to know whether the weights of transgenic and non-transgenic mice differ and, if so, which fragment or fragments of the DNA are responsible. This is less trivial than it first seems, as many other factors, such as gender, cage and age, may affect the weight of the mice. These factors are not of direct interest to the scientists but can still affect the findings.

Data Background

This data is obtained from Chapter 11 of Nolan and Speed (2000). The Human Genome Center at the Lawrence Berkeley Laboratory constructed a panel of transgenic mice, each containing one of four fragments of cloned human chromosome 21.

Data Cleaning

The variables in the dataset can be seen under Appendix A: Data Dictionary.

The format of the data is somewhat unusual, so some data cleaning is required before analysis.

The variables tg and sex are stored as integers but are really categorical, so they are recoded as labelled categories for ease of analysis.

All non-transgenic mice are pooled together to provide a better estimate of the variability in weight. This is done by creating 5 categories: one for each of the four possible DNA fragments and one for the absence of any trisomy.

library(tidyverse)
x = read_table("https://raw.githubusercontent.com/DATA2002/data/master/mouse.txt")

x = x %>% mutate(
  sex = if_else(sex == 1, "Male", "Female"),
  DNA = case_when(DNA == 1 ~ "141G6",
                  DNA == 2 ~ "152F7", 
                  DNA == 3 ~ "230E8",
                  DNA == 4 ~ "285E6"),
  cage = factor(cage),
  tg = if_else(tg == 1, "Transgenic", "Non-transgenic")
)

x = x %>% 
  mutate(
    DNAfragment = case_when(
      tg == "Transgenic" ~ DNA,
      TRUE ~ "No trisomy"
    )
  )

# check that number within each subgroup is reasonable in size
x %>% group_by(sex, tg, DNA) %>% count() %>% spread(key = DNA, value = n)
## # A tibble: 4 x 6
## # Groups:   sex, tg [4]
##   sex    tg             `141G6` `152F7` `230E8` `285E6`
##   <chr>  <chr>            <int>   <int>   <int>   <int>
## 1 Female Non-transgenic      43      51       9      48
## 2 Female Transgenic          40      31       7      38
## 3 Male   Non-transgenic      30      37       7      31
## 4 Male   Transgenic          64      39      14      43
x %>% group_by(tg, DNA) %>% count() %>% spread(key = DNA, value = n)
## # A tibble: 2 x 5
## # Groups:   tg [2]
##   tg             `141G6` `152F7` `230E8` `285E6`
##   <chr>            <int>   <int>   <int>   <int>
## 1 Non-transgenic      73      88      16      79
## 2 Transgenic         104      70      21      81
# check the numbers match up
x %>% count(DNAfragment)
## # A tibble: 5 x 2
##   DNAfragment     n
##   <chr>       <int>
## 1 141G6         104
## 2 152F7          70
## 3 230E8          21
## 4 285E6          81
## 5 No trisomy    256
x %>% group_by(tg, DNA) %>% count()
## # A tibble: 8 x 3
## # Groups:   tg, DNA [8]
##   tg             DNA       n
##   <chr>          <chr> <int>
## 1 Non-transgenic 141G6    73
## 2 Non-transgenic 152F7    88
## 3 Non-transgenic 230E8    16
## 4 Non-transgenic 285E6    79
## 5 Transgenic     141G6   104
## 6 Transgenic     152F7    70
## 7 Transgenic     230E8    21
## 8 Transgenic     285E6    81

Relationships of Mice Variables (Transgenic, Gender and DNA) with Weight

Transgenic and Not Transgenic Mice

Data visualization of transgenic and not transgenic mice with respect to their corresponding weights:

ggplot(x, aes(y = weight)) + 
  geom_boxplot() + 
  theme_bw() + 
  facet_grid( ~ tg)

From the box plots, the ranges of the two groups are similar, as are the interquartile ranges. Transgenic mice have a slightly higher median weight than non-transgenic mice.

Male and Female Mice

Data visualization of the gender of mice grouped by whether they are transgenic or not with respect to their corresponding weights:

ggplot(x, aes(y = weight)) + 
  geom_boxplot() + 
  theme_bw() + 
  facet_grid(sex ~ tg)

Among female mice, the range and median of weight are very close for transgenic and non-transgenic mice. The weights of male transgenic and non-transgenic mice are similar as well.

Data visualization of gender of mice with respect to their corresponding weights:

ggplot(x, aes(y = weight)) + 
  geom_boxplot() + 
  theme_bw() + 
  facet_grid(tg ~ sex)

From the plot, male mice are heavier than female mice in both the transgenic and non-transgenic groups.

DNA Fragment

Data visualization of DNA fragment with respect to their corresponding weights:

ggplot(x, aes(y = weight)) + 
  geom_boxplot() + 
  theme_bw() + 
  facet_grid( ~ DNA)

The weights for the four DNA fragments are similar in range. 285E6 has a slightly lower median weight, while 152F7 has the highest median and the largest interquartile range.

Summary Statistics for Transgenic, Gender and DNA

Summary statistics for transgenic and non-transgenic with weight:

x %>%
  group_by(tg) %>%
  summarise(Count = n(), Mean = mean(weight), Variance = var(weight))
## # A tibble: 2 x 4
##   tg             Count  Mean Variance
##   <chr>          <int> <dbl>    <dbl>
## 1 Non-transgenic   256  28.4     15.5
## 2 Transgenic       276  29.5     14.9

Summary statistics for gender with weight:

x %>%
  group_by(sex) %>%
  summarise(count = n(), mean = mean(weight), variance = var(weight))
## # A tibble: 2 x 4
##   sex    count  mean variance
##   <chr>  <int> <dbl>    <dbl>
## 1 Female   267  26.0     5.09
## 2 Male     265  31.9     8.39

Summary statistics for weight by gender and transgenic status:

x %>%
  group_by(sex, tg) %>%
  summarise(count = n(), mean = mean(weight), variance = var(weight))
## # A tibble: 4 x 5
## # Groups:   sex [?]
##   sex    tg             count  mean variance
##   <chr>  <chr>          <int> <dbl>    <dbl>
## 1 Female Non-transgenic   151  25.9     4.83
## 2 Female Transgenic       116  26.2     5.44
## 3 Male   Non-transgenic   105  31.9     9.28
## 4 Male   Transgenic       160  31.9     7.86
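
The overall gap between the transgenic and non-transgenic means (29.5 versus 28.4) largely disappears within each sex in the table above. A minimal sketch in base R, using only the rounded counts and means printed in that table, shows that the overall means are count-weighted averages of the within-sex means, so the gap mostly reflects the different share of males in the two groups. The data frame `cells` and its column names are illustrative and typed in by hand; they are not part of the original analysis.

```r
# Rounded cell counts and means copied from the sex-by-tg summary table above
cells <- data.frame(
  sex   = c("Female", "Female", "Male", "Male"),
  tg    = c("Non-transgenic", "Transgenic", "Non-transgenic", "Transgenic"),
  count = c(151, 116, 105, 160),
  mean  = c(25.9, 26.2, 31.9, 31.9)
)

# Overall mean per tg group = count-weighted average of the within-sex means
overall_mean <- sapply(split(cells, cells$tg),
                       function(d) weighted.mean(d$mean, d$count))

# Share of males in each tg group
male_share <- sapply(split(cells, cells$tg),
                     function(d) sum(d$count[d$sex == "Male"]) / sum(d$count))

round(overall_mean, 1)
round(male_share, 2)
```

With these rounded inputs, roughly 58% of the transgenic mice are male compared with 41% of the non-transgenic mice, which is why later comparisons of DNA fragments are more informative when sex is accounted for.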

Summary statistics for weight by DNA fragment and gender:

x %>%
  group_by(DNA, sex) %>%
  summarise(count = n(), mean = mean(weight), variance = var(weight))
## # A tibble: 8 x 5
## # Groups:   DNA [?]
##   DNA   sex    count  mean variance
##   <chr> <chr>  <int> <dbl>    <dbl>
## 1 141G6 Female    83  25.2     4.03
## 2 141G6 Male      94  31.7     6.65
## 3 152F7 Female    82  27.5     6.36
## 4 152F7 Male      76  33.2     7.99
## 5 230E8 Female    16  25.6     2.23
## 6 230E8 Male      21  32.4     7.50
## 7 285E6 Female    86  25.4     2.58
## 8 285E6 Male      74  30.7     8.13

Is there any evidence that DNA fragment affects the weight of the mice?

DNA Fragments with Weight

Checking for normality and variance assumption

# anova test
x %>% count(DNAfragment)
## # A tibble: 5 x 2
##   DNAfragment     n
##   <chr>       <int>
## 1 141G6         104
## 2 152F7          70
## 3 230E8          21
## 4 285E6          81
## 5 No trisomy    256
weight_anova=aov(weight ~ DNAfragment, data = x)
summary(weight_anova)
##              Df Sum Sq Mean Sq F value   Pr(>F)    
## DNAfragment   4    510   127.5   8.736 7.82e-07 ***
## Residuals   527   7693    14.6                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# check the assumption
library(ggplot2)
library(ggfortify)
autoplot(weight_anova,which=c(1,2))

As the residuals-versus-fitted plot shows an almost perfectly horizontal trend line, the equal-variance assumption appears reasonable. The residuals fall very close to the line in the normal Q-Q plot, so the normality assumption also appears reasonable and an ANOVA test can be performed.

Based on the background reading, the inserted DNA is not always expressed, so we pool the mice by inserted DNA fragment, and all of the non-transgenic mice are allocated to the No trisomy group.

# plot the weight ~ DNAfragment (pooling)
ggplot(x, aes(sample = weight)) +
  geom_qq() + geom_qq_line() +
  facet_wrap(~ DNAfragment) +
  theme_classic(base_size = 20)

ggplot(x, aes(x = DNAfragment, y = weight)) +
  geom_boxplot() + theme_classic(base_size = 20) +coord_flip()

Hypothesis

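\(H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5\)

\(H_1\): at least one \(\mu_i \neq \mu_j\) for \(i \neq j\)

Assumptions: Each group is normally distributed with equal variance, and the observations are independent.

Test statistic: \(T = \frac{\text{Treatment Mean Sq}}{\text{Residual Mean Sq}}\). Under \(H_0\), \(T \sim F_{g-1, N-g}\), where g = 5 is the number of groups (four DNA fragments plus No trisomy) and N = 532 is the number of mice.

Observed test statistic: \(t_0 = \frac{127.5}{14.6} \approx 8.73\) from the rounded mean squares (R reports 8.736)

P value: \(P(T \geq 8.736) = 7.82 \times 10^{-7}\)

Conclusion

Since the p-value is smaller than 0.05, we reject the null hypothesis and conclude that at least two of the group means are not equal.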
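
The observed statistic and p-value above can be checked directly from the printed ANOVA table. The following is a minimal sketch in base R, assuming only the rounded mean squares and degrees of freedom shown in the `summary(weight_anova)` output; the variable names are illustrative.

```r
# Values copied from the one-way ANOVA table above
ms_treatment <- 127.5   # DNAfragment mean square
ms_residual  <- 14.6    # residual mean square
df_treatment <- 4       # DNAfragment degrees of freedom
df_residual  <- 527     # residual degrees of freedom

# Observed F statistic from the rounded mean squares
f_obs <- ms_treatment / ms_residual

# Upper-tail probability under the F distribution with (4, 527) df
p_value <- pf(f_obs, df_treatment, df_residual, lower.tail = FALSE)

# 5% critical value, for comparison with the observed statistic
f_crit <- qf(0.95, df_treatment, df_residual)

c(F = f_obs, critical = f_crit, p = p_value)
```

Because the inputs are rounded, the result agrees with the reported F value and p-value only approximately; the observed statistic sits well above the 5% critical value either way.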

DNA Fragments (Male) with Weight

Checking for normality and variance assumption

a2 = aov(weight ~ DNA, data = filter(x, sex == "Male"))
summary(a2)
##              Df Sum Sq Mean Sq F value  Pr(>F)    
## DNA           3  253.6   84.52   11.24 5.8e-07 ***
## Residuals   261 1961.6    7.52                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(a2, which = 1: 2)

As the residuals-versus-fitted plot shows an almost perfectly horizontal trend line, the equal-variance assumption appears reasonable. The residuals fall very close to the line in the normal Q-Q plot, so the normality assumption also appears reasonable and an ANOVA test can be performed.

ANOVA Test

This section explores the relationship between DNA fragment and weight for male mice. The null hypothesis is that the mean weight of male mice is the same for all DNA fragments, while the alternative hypothesis is that the mean weight of male mice differs between at least two DNA fragments.

Hypothesis

\(H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4\)

\(H_1\): at least one \(\mu_i \neq \mu_j\) for \(i \neq j\)

Assumptions

Observations are independent within each of the 4 samples. Each of the 4 populations has the same variance, \(\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2 = \sigma^2\). Each of the 4 populations is normally distributed.

Test statistic

\(T = \frac{\text{Treatment Mean Sq}}{\text{Residual Mean Sq}}.\) Under \(H_0\), \(T \sim F_{g-1, N-g}\) where g = 4 is the number of groups and N = 265 is the number of male mice.

Observed Test statistic

\(t_0 = \frac{84.52}{7.52} = 11.24.\)

p-value

\(P(T \geq 11.24) = 5.8 \times 10^{-7}\)

Conclusion

As the p-value is smaller than 0.05, we reject the null hypothesis as the data provides significant evidence to support \(H_1\). This suggests that the weights of male mice are different for different DNA fragments.
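
To make the link between the group summaries and the F statistic concrete, the sums of squares for male mice can be rebuilt approximately from the counts, means and variances reported earlier in the summary statistics by DNA fragment and sex. This is a sketch under the assumption that those rounded values are close enough to the raw data; the variable names are illustrative and the result will not match the `aov` output exactly.

```r
# Rounded group summaries for male mice, copied from the DNA-by-sex table
n <- c(`141G6` = 94, `152F7` = 76, `230E8` = 21, `285E6` = 74)  # group sizes
m <- c(31.7, 33.2, 32.4, 30.7)                                  # group means
v <- c(6.65, 7.99, 7.50, 8.13)                                  # group variances

g <- length(n)                    # number of groups
N <- sum(n)                       # total number of male mice
grand_mean <- sum(n * m) / N      # count-weighted overall mean

# Between-group and within-group sums of squares
ss_treatment <- sum(n * (m - grand_mean)^2)
ss_residual  <- sum((n - 1) * v)

# Approximate F statistic and p-value on (g - 1, N - g) degrees of freedom
f_approx <- (ss_treatment / (g - 1)) / (ss_residual / (N - g))
p_approx <- pf(f_approx, g - 1, N - g, lower.tail = FALSE)

c(SS_treatment = ss_treatment, SS_residual = ss_residual, F = f_approx, p = p_approx)
```

The residual sum of squares reproduces the reported value closely, while the treatment sum of squares and F statistic come out somewhat below the reported 253.6 and 11.24 because the group means are rounded to one decimal place. The conclusion is unchanged.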

DNA Fragments (Female) with Weight

Checking for normality and variance assumption

a3 = aov(weight ~ DNA, data = filter(x, sex == "Female"))
summary(a3)
##              Df Sum Sq Mean Sq F value  Pr(>F)    
## DNA           3  256.1   85.37   20.45 6.1e-12 ***
## Residuals   263 1098.0    4.17                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(a3, which = 1: 2)

As the residuals-versus-fitted plot shows a combination of increasing and decreasing trends, the equal-variance assumption is not valid. The residuals fall very close to the line in the normal Q-Q plot, so the normality assumption is reasonable, but because the variances are unequal a standard ANOVA test cannot be performed. Therefore, a Welch test is performed instead.
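
The reading of the residual plots can be cross-checked against the group variances reported earlier in the summary statistics by DNA fragment and sex. The sketch below, in base R with illustrative variable names, computes the ratio of the largest to the smallest group variance within each sex. No formal threshold is assumed here; the ratio is only a rough descriptive aid.

```r
# Group variances by DNA fragment, copied from the DNA-by-sex summary table
var_female <- c(`141G6` = 4.03, `152F7` = 6.36, `230E8` = 2.23, `285E6` = 2.58)
var_male   <- c(`141G6` = 6.65, `152F7` = 7.99, `230E8` = 7.50, `285E6` = 8.13)

# Ratio of the largest to the smallest group variance within each sex
spread_ratio <- c(Female = max(var_female) / min(var_female),
                  Male   = max(var_male) / min(var_male))

round(spread_ratio, 2)
```

The female groups differ in variance by a noticeably larger factor than the male groups, which is consistent with treating the two sexes differently when choosing between a standard ANOVA and a Welch-type test.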

Welch Test

This section explores the relationship between DNA fragment and weight for female mice. The null hypothesis is that the mean weight of female mice is the same for all DNA fragments, while the alternative hypothesis is that the mean weight of female mice differs between at least two DNA fragments. The pairwise t-tests below use non-pooled standard deviations with a Bonferroni adjustment.

female = filter(x, sex == "Female")
pairwise.t.test(female$weight, female$DNA, p.adjust.method = "bonferroni", pool.sd = FALSE)
## 
##  Pairwise comparisons using t tests with non-pooled SD 
## 
## data:  female$weight and female$DNA 
## 
##       141G6   152F7   230E8 
## 152F7 1.7e-08 -       -     
## 230E8 1.0000  0.0025  -     
## 285E6 1.0000  3.6e-08 1.0000
## 
## P value adjustment method: bonferroni

Hypothesis

\(H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4\)

\(H_1\): at least one \(\mu_i \neq \mu_j\) for \(i \neq j\)

Conclusion

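All three adjusted p-values for comparisons involving 152F7 (1.7e-08 against 141G6, 0.0025 against 230E8 and 3.6e-08 against 285E6) are smaller than 0.05, so we reject the null hypothesis as the data provide significant evidence to support \(H_1\). This suggests that the mean weights of female mice differ between DNA fragments, with 152F7 differing from each of the other three fragments. The remaining comparisons (adjusted p-values of 1.0000) show no significant differences.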
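
The hypotheses above are stated for all four groups at once, whereas the output shown is a set of pairwise comparisons. A single omnibus Welch statistic can be sketched from the group counts, means and variances reported earlier for female mice. The function `welch_from_summary` is a hypothetical helper written for this illustration; its formulas follow the unequal-variance case of `stats::oneway.test()`, and the inputs are rounded, so the numbers are approximate.

```r
# Welch's one-way test computed from group counts, means and variances
welch_from_summary <- function(n, m, v) {
  k <- length(n)
  w <- n / v                                   # precision weights n_i / s_i^2
  m_w <- sum(w * m) / sum(w)                   # precision-weighted grand mean
  lambda <- sum((1 - w / sum(w))^2 / (n - 1)) / (k^2 - 1)
  f <- sum(w * (m - m_w)^2) / ((k - 1) * (1 + 2 * (k - 2) * lambda))
  df2 <- 1 / (3 * lambda)                      # approximate denominator df
  c(F = f, df1 = k - 1, df2 = df2,
    p = pf(f, k - 1, df2, lower.tail = FALSE))
}

# Rounded summaries for female mice, copied from the DNA-by-sex table
res <- welch_from_summary(
  n = c(83, 82, 16, 86),
  m = c(25.2, 27.5, 25.6, 25.4),
  v = c(4.03, 6.36, 2.23, 2.58)
)
res
```

With the raw data available, the same test would normally be run as `oneway.test(weight ~ DNA, data = female, var.equal = FALSE)`; the summary-based version is shown only because it needs nothing beyond the tables in this report.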

Is there any evidence that DNA fragment affects the weight of different gender mice differently (one-way ANOVA model)?

From above, we know that the mean weights of male and female mice are different. We also know that the effect of DNA fragment differs between the sexes.

This section explores whether a particular DNA fragment affects one sex more than the other, and which fragment shows the greatest difference from non-transgenic mice of the same sex.

We use multiple comparisons to find the groups that differ significantly, using the emmeans package with the Bonferroni and Scheffé adjustments.

# multiple pairwise comparisons
library(emmeans)
weight_em=emmeans(weight_anova, ~ DNAfragment)
confint(pairs(weight_em, adjust = "bonferroni")) %>% plot() +
  theme_classic(base_size = 20) + geom_vline(xintercept = 0)

confint(pairs(weight_em, adjust = "scheff")) %>% plot() +
  theme_classic(base_size = 20) + geom_vline(xintercept = 0)

test(pairs(weight_em, adjust = "bonferroni"))
##  contrast             estimate        SE  df t.ratio p.value
##  141G6 - 152F7      -1.6598901 0.5906614 527  -2.810  0.0513
##  141G6 - 230E8      -1.0622711 0.9140255 527  -1.162  1.0000
##  141G6 - 285E6       1.2689459 0.5661823 527   2.241  0.2543
##  141G6 - No trisomy  0.9943510 0.4442672 527   2.238  0.2563
##  152F7 - 230E8       0.5976190 0.9505865 527   0.629  1.0000
##  152F7 - 285E6       2.9288360 0.6234858 527   4.698  <.0001
##  152F7 - No trisomy  2.6542411 0.5153110 527   5.151  <.0001
##  230E8 - 285E6       2.3312169 0.9355727 527   2.492  0.1302
##  230E8 - No trisomy  2.0566220 0.8672412 527   2.371  0.1808
##  285E6 - No trisomy -0.2745949 0.4870596 527  -0.564  1.0000
## 
## P value adjustment: bonferroni method for 10 tests
test(pairs(weight_em, adjust = "scheff"))
##  contrast             estimate        SE  df t.ratio p.value
##  141G6 - 152F7      -1.6598901 0.5906614 527  -2.810  0.0971
##  141G6 - 230E8      -1.0622711 0.9140255 527  -1.162  0.8526
##  141G6 - 285E6       1.2689459 0.5661823 527   2.241  0.2864
##  141G6 - No trisomy  0.9943510 0.4442672 527   2.238  0.2878
##  152F7 - 230E8       0.5976190 0.9505865 527   0.629  0.9828
##  152F7 - 285E6       2.9288360 0.6234858 527   4.698  0.0002
##  152F7 - No trisomy  2.6542411 0.5153110 527   5.151  <.0001
##  230E8 - 285E6       2.3312169 0.9355727 527   2.492  0.1858
##  230E8 - No trisomy  2.0566220 0.8672412 527   2.371  0.2307
##  285E6 - No trisomy -0.2745949 0.4870596 527  -0.564  0.9886
## 
## P value adjustment: scheffe method with dimensionality 4
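
The two adjustments can be reproduced by hand for a single row of the tables above. The sketch below, in base R, takes the rounded t ratio for the 141G6 - 152F7 contrast and applies the Bonferroni rule (multiply the unadjusted p-value by the number of tests) and the Scheffé rule (refer \(t^2/(g-1)\) to an F distribution). The variable names are illustrative, and small discrepancies from the printed p-values are expected because the t ratio is rounded.

```r
# First contrast from the tables above: 141G6 - 152F7
t_ratio  <- -2.810        # reported t ratio (rounded)
df_resid <- 527           # residual degrees of freedom
n_tests  <- choose(5, 2)  # 10 pairwise comparisons among 5 groups
rank_dim <- 4             # Scheffe dimensionality, g - 1

# Unadjusted two-sided p-value
p_raw <- 2 * pt(-abs(t_ratio), df_resid)

# Bonferroni: multiply by the number of tests, capped at 1
p_bonferroni <- min(1, n_tests * p_raw)

# Scheffe: refer t^2 / (g - 1) to an F distribution with (g - 1, df) df
p_scheffe <- pf(t_ratio^2 / rank_dim, rank_dim, df_resid, lower.tail = FALSE)

round(c(raw = p_raw, bonferroni = p_bonferroni, scheffe = p_scheffe), 4)
```

For pairwise comparisons the Scheffé adjustment is the more conservative of the two here, which matches the larger p-value it reports for this contrast.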

Since the weights of male and female mice are different, we run the tests separately for each sex in order to remove the effect of sex.

Visual representation of the DNA fragments of different sex with weight:

# consider the sex
ggplot(x, aes(x=DNA, y=weight, colour=DNA)) +
  geom_boxplot() + theme_classic(base_size = 20) +
  facet_wrap( ~ sex)

x %>% group_by(sex,DNA) %>% count()
## # A tibble: 8 x 3
## # Groups:   sex, DNA [8]
##   sex    DNA       n
##   <chr>  <chr> <int>
## 1 Female 141G6    83
## 2 Female 152F7    82
## 3 Female 230E8    16
## 4 Female 285E6    86
## 5 Male   141G6    94
## 6 Male   152F7    76
## 7 Male   230E8    21
## 8 Male   285E6    74

Visual representation of the difference between the DNA fragments for male mice

x_m=filter(x, sex=='Male')
weight_m_anova=aov(weight ~ DNAfragment, data = x_m)
summary(weight_m_anova)
##              Df Sum Sq Mean Sq F value   Pr(>F)    
## DNAfragment   4  191.3   47.83   6.144 9.72e-05 ***
## Residuals   260 2023.9    7.78                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
weight_m_em=emmeans(weight_m_anova, ~DNAfragment)
confint(pairs(weight_m_em,adjust = 'bonferroni')) %>% plot() + geom_vline(xintercept = 0)

contrast(weight_m_em, method = 'pairwise', adjust = 'scheff')
##  contrast             estimate        SE  df t.ratio p.value
##  141G6 - 152F7      -1.1118189 0.5667634 260  -1.962  0.4288
##  141G6 - 230E8      -0.3515625 0.8231875 260  -0.427  0.9961
##  141G6 - 285E6       1.8519259 0.5501399 260   3.366  0.0251
##  141G6 - No trisomy  0.1708185 0.4424500 260   0.386  0.9973
##  152F7 - 230E8       0.7602564 0.8692547 260   0.875  0.9428
##  152F7 - 285E6       2.9637448 0.6169441 260   4.804  0.0002
##  152F7 - No trisomy  1.2826374 0.5231904 260   2.452  0.2018
##  230E8 - 285E6       2.2034884 0.8585086 260   2.567  0.1629
##  230E8 - No trisomy  0.5223810 0.7938168 260   0.658  0.9796
##  285E6 - No trisomy -1.6811074 0.5051350 260  -3.328  0.0279
## 
## P value adjustment: scheffe method with dimensionality 4

Contrasts are used to find the groups that differ significantly. For male mice, the 285E6 group differs significantly from the No trisomy, 152F7 and 141G6 groups.

Visual representation of the difference between the DNA fragments for female mice

x_f=filter(x, sex=='Female')
weight_f_anova=aov(weight ~ DNAfragment, data = x_f)
summary(weight_f_anova)
##              Df Sum Sq Mean Sq F value   Pr(>F)    
## DNAfragment   4  208.1   52.02   11.89 6.81e-09 ***
## Residuals   262 1146.0    4.37                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
weight_f_em=emmeans(weight_f_anova, ~ DNAfragment)
confint(pairs(weight_f_em, adjust = 'bonferroni')) %>% plot() + geom_vline(xintercept = 0)

contrast(weight_f_em, method = 'pairwise', adjust = 'scheff')
##  contrast             estimate        SE  df t.ratio p.value
##  141G6 - 152F7      -3.2852419 0.5004563 262  -6.564  <.0001
##  141G6 - 230E8      -1.3889286 0.8568755 262  -1.621  0.6225
##  141G6 - 285E6      -0.6727632 0.4737763 262  -1.420  0.7328
##  141G6 - No trisomy -0.9088907 0.3719170 262  -2.444  0.2046
##  152F7 - 230E8       1.8963134 0.8752049 262   2.167  0.3228
##  152F7 - 285E6       2.6124788 0.5061739 262   5.161  <.0001
##  152F7 - No trisomy  2.3763512 0.4123958 262   5.762  <.0001
##  230E8 - 285E6       0.7161654 0.8602274 262   0.833  0.9520
##  230E8 - No trisomy  0.4800378 0.8086096 262   0.594  0.9861
##  285E6 - No trisomy -0.2361276 0.3795757 262  -0.622  0.9834
## 
## P value adjustment: scheffe method with dimensionality 4

For female mice, the 152F7 group differs significantly from the No trisomy, 285E6 and 141G6 groups.

Conclusion

The evidence indicates that the DNA fragments affect the two sexes differently: the 285E6 fragment is associated with significant weight differences among male mice, while the 152F7 fragment is associated with significant weight differences among female mice.

Is there any evidence that DNA fragment affects the weight of different gender mice differently (two-way ANOVA model)?

This section explores the interaction effects between sex and DNA fragment.

a2 = aov(weight ~ DNAfragment * sex, data = x)
library(emmeans)
emmip(a2, DNAfragment ~ sex) + theme_classic(base_size = 16)

emmip(a2, sex ~ DNAfragment) + theme_classic(base_size = 16)

From the interaction plots above, we can infer that there is an interaction between the sex and the DNAfragment of the mice. This is shown by the intersections between the traces. The only DNAfragment level that appears to show a strong interaction with gender is 141G6.

# ANOVA
anova(a2)
## Analysis of Variance Table
## 
## Response: weight
##                  Df Sum Sq Mean Sq  F value    Pr(>F)    
## DNAfragment       4  510.1   127.5  20.9995 4.543e-16 ***
## sex               1 4434.7  4434.7 730.2710 < 2.2e-16 ***
## DNAfragment:sex   4   88.0    22.0   3.6216  0.006351 ** 
## Residuals       522 3169.9     6.1                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

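As seen from the two-way ANOVA above, the sex and DNAfragment main effects are both significant (there is a significant difference between the levels of each factor). The DNAfragment:sex interaction is also significant at the 5% level (p = 0.006351), which supports the interaction seen in the plots.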
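
The two-way model behind this table can be written out explicitly. The notation below is a standard textbook form rather than something stated in the original analysis, with \(i\) indexing the 5 DNAfragment levels, \(j\) the 2 sexes and \(k\) the mice within each combination:

\(Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \quad \varepsilon_{ijk} \sim N(0, \sigma^2)\)

The interaction row tests \(H_0: (\alpha\beta)_{ij} = 0\) for all \(i, j\). Its degrees of freedom are \((5-1)(2-1) = 4\), the residual degrees of freedom are \(532 - 5 \times 2 = 522\), and the observed statistic is the ratio of mean squares, \(\frac{22.0}{6.1} \approx 3.6\), which agrees with the reported F value of 3.6216 up to rounding.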

Post hoc Comparison

The pairwise difference t-statistics

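# pairwise difference t-statistics
s_emm = contrast(emmeans(a2, ~sex), method = "pairwise", adjust = "scheffe")
d_emm = contrast(emmeans(a2, ~DNAfragment), method = "pairwise", adjust = "scheffe")

# combining the two sets of contrasts re-adjusts the p-values over all 11 tests;
# the output below reports the Bonferroni method rather than Scheffe
sd_emm = update(s_emm + d_emm)
sd_emm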
##  contrast             estimate        SE  df t.ratio p.value
##  Female - Male      -5.7529000 0.3032264 522 -18.972  <.0001
##  141G6 - 152F7      -2.1985304 0.3867492 522  -5.685  <.0001
##  141G6 - 230E8      -0.8702455 0.6220886 522  -1.399  1.0000
##  141G6 - 285E6       0.5895814 0.3700440 522   1.593  1.0000
##  141G6 - No trisomy -0.3690361 0.2935775 522  -1.257  1.0000
##  152F7 - 230E8       1.3282849 0.6428212 522   2.066  0.4322
##  152F7 - 285E6       2.7881118 0.4039273 522   6.903  <.0001
##  152F7 - No trisomy  1.8294943 0.3352797 522   5.457  <.0001
##  230E8 - 285E6       1.4598269 0.6329112 522   2.307  0.2362
##  230E8 - No trisomy  0.5012094 0.5914658 522   0.847  1.0000
##  285E6 - No trisomy -0.9586175 0.3158640 522  -3.035  0.0278
## 
## Results are averaged over some or all of the levels of: DNAfragment, sex 
## P value adjustment: bonferroni method for 11 tests

Plot of pairwise difference with 95% confidence intervals

plot(sd_emm) + theme_classic(base_size = 16) + geom_vline(xintercept = 0) +
  labs(x = "Estimated pairwise mean difference",
       caption = "95% confidence intervals adjusted for multiple testing using the Bonferroni method")

For sex, the difference between female and male mice is highly significant, with male mice significantly heavier than female mice.

For DNA, the differences 141G6-152F7, 152F7-285E6 and 152F7-No trisomy are highly significant (p < 0.0001), and 285E6-No trisomy is significant at the 5% level (p = 0.0278), while the others are not significant.

This further supports the conclusion from the section “Is there any evidence that DNA fragment affects the weight of different gender mice differently (one-way ANOVA model)?” that the 152F7 and 285E6 DNA fragments have a significant effect on weight.

Summary

This report set out to investigate which DNA fragment affects weight gain in the mice.

Identifying the DNA fragment that causes weight gain in mice is still only an early stage of the research. There is a long way to go before the knowledge can be applied to humans.

Our findings suggest that two different DNA fragments are involved, each affecting one gender, which might seem counter-intuitive as it was initially thought that weight is affected by only one DNA fragment.

Breakthroughs in genetics, such as the work this report draws on, present us with a promise and a predicament. Gene editing is a widely debated issue, and arguments for and against can be found under Appendix B: The Case of Gene Editing.

Appendix A: Data Dictionary

Variable name | Type | Definition
DNA | integer | Fragment of chromosome 21 integrated in parent mouse, where 1 = 141G6, 2 = 152F7, 3 = 230E8, 4 = 285E6.
line | character | Family line.
tg | integer | Whether the mouse contains the extra DNA and is therefore transgenic (1) or not transgenic (0).
sex | integer | Mouse gender, where 1 = male, 0 = female.
age | integer | Age of mouse (in days) at the time of weighing.
weight | integer | Weight of mouse in grams, to the nearest tenth of a gram.
cage | integer | Number of the cage in which the mouse lived.

Appendix B: The Case of Gene Editing

Gene editing has the potential to help millions of people suffering from genetic conditions such as Down syndrome. It could be applied to more than 10,000 conditions, including sickle cell anemia, cystic fibrosis and some cases of early-onset Alzheimer’s (Belluck, 2017). It also has the potential to save people from dying of malaria: inserting an artificial gene into mosquitoes could make them sterile and unable to spread malaria (Last Week Tonight, 2018).

Human trials are very rare, although results on plants and animals have been promising. However, as with many scientific advances, gene editing has been met with cynicism. A comparable case is that of test tube babies: when the method first appeared, its potential risks and benefits were widely debated, yet as time has passed people have come to take it for granted. Hollywood bombards us with what currently seem far-fetched visions of the dystopian future that gene editing would bring, in movies such as Jurassic Park and Planet of the Apes. Arguments against gene editing come from many sides: ethical and moral, religious and, to some extent, scientific points of view.

In the wrong hands, gene editing, like other technologies, could have unwanted consequences. Science is like a double-edged sword with the potential to save or eradicate life. Gene editing may disturb the ecosystem, with negative consequences. An example of disturbing a delicate ecosystem can be found in Australia, where in the 1930s about 100 cane toads were introduced to control cane beetles. The toads not only failed to control the beetle population but multiplied into hundreds of millions and created havoc (Last Week Tonight, 2018).

A clichéd religious objection is ‘are we playing god?’. This raises the moral question of whether gene editing could be used not only to fight diseases but also for human enhancement. What counts as a disease may not be viewed in the same way across the community. For many in the deaf community, deafness is not a disease, and many people with dwarfism do not think of themselves as sick or suffering but see their condition as a unique identity (Explained, 2018). The idea that people are suffering from these ‘diseases’ raises the question of whether we are naïve or ignorant in wanting everyone to be similar, and hence raises another point about the meaning of being human. What does it mean to be human or to be unique?

There is a debate between people who perceive their ‘disease’ as part of the identity that makes them unique and other parts of society who feel the trait is a ‘flaw’ that needs to be corrected. Are we going to deprive humans of their own identity? This brings us to a famous case from 2002, when a deaf lesbian couple in the US set out to have a child who is deaf like them. The couple deliberately tried to conceive a child with the help of a sperm donor who has a history of deafness for five generations in his family (Teather, 2002).

We cannot ignore the fact that people with Down syndrome can be successful. Angela Bachiller, a person with Down syndrome, is a Spanish city councilor for Valladolid and was sworn in on 29 July 2013 (Flanders, 2014).

Figure: Angela Bachiller

Pablo Pineda, also a person with Down syndrome, is a writer, speaker and actor. He also earned a bachelor’s degree in educational psychology (Flanders, 2014). The list of successful people with Down syndrome goes on.

Figure: Pablo Pineda

“There is something appealing, even intoxicating, about a vision of human freedom unfettered by the given. It may even be the case that the allure of that vision played a part in summoning the genomic age into being. It is often assumed that the powers of enhancement we now possess arose as an inadvertent by-product of biomedical advancement-the genetic revolution came, so to speak, to cure diseases, and stayed to tempt us with the prospect of enhancing our performance, designing our children, and perfecting our nature. That may have the story backward. It is more plausible to view genetic engineering as the ultimate expression of our resolve to see ourselves astride the world, the masters of our nature. But that promise of mastery is flawed. It threatens to banish our appreciation of life as a gift and to leave us with nothing to affirm or behold outside our own will” (Sandel, 2004).

References

Belluck, Pam (2017). In Breakthrough, Scientists Edit a Dangerous Mutation From Genes in Human Embryos. New York Times. Obtained from: https://www.nytimes.com/2017/08/02/science/gene-editing-human-embryos.html

Down Syndrome Australia (2018). Obtained from: https://www.downsyndrome.org.au

Flanders, Nancy (2014). 9 Successful People With Down Syndrome Who Prove Life is Worth Living. Obtained from: https://www.lifenews.com/2014/11/10/9-successful-people-with-down-syndrome-who-prove-life-is-worth-living/

Hadley Wickham (2018). forcats: Tools for Working with Categorical Variables (Factors). R package version 0.3.0. https://CRAN.R-project.org/package=forcats

Hadley Wickham (2018). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.3.1. https://CRAN.R-project.org/package=stringr

HBO. (2018). Last Week Tonight by John Oliver [TV programme]. Gene Editing

Main Features - Height and weight. (2013, June 7). Retrieved September 10, 2018, from http://www.abs.gov.au/ausstats/abs@.nsf/lookup/4338.0main features212011-13

Nolan, D. and Speed, T. (2000). Stat Labs: Mathematical Statistics through Applications. Springer Verlag.

Posner, Joe (2018). Explained [TV programme]. Designer DNA

R Core Team (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

Sandel, Michael J. (2004). The Case Against Perfection. Atlantic Monthly.

Teather, D. (2002). Deaf baby designed by deaf lesbian couple. The Age, 9 April, news section, p. 3.

Yihui Xie (2018). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.20.

Yihui Xie (2018). DT: A Wrapper of the JavaScript Library ‘DataTables’. R package version 0.4. https://CRAN.R-project.org/package=DT